Evolutionary minimization of the Rand index for speaker clustering
Abstract
We propose an effective method for clustering unknown speech utterances by their associated speakers. The method jointly optimizes the generated clusters and the required number of clusters by estimating and minimizing the Rand index. This metric reflects the clustering errors that arise when utterances from the same speaker are placed in different clusters, or when utterances from different speakers are placed in the same cluster. One useful characteristic of the Rand index is that its value reaches its minimum only when the number of clusters equals the size of the true speaker population. We approximate the Rand index by a function of the similarity measures between utterances and then use a genetic algorithm to determine the cluster in which each utterance should be placed, such that the function is minimized. Our experimental results show that this novel speaker-clustering method outperforms conventional methods that use the Bayesian information criterion to determine the required number of clusters.
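To make the error-counting form of the Rand index described in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the surrogate `estimated_errors` assumes each pairwise similarity lies in [0, 1] and can be read as the probability that two utterances share a speaker. The abstract does not give the paper's actual approximating function, and the genetic-algorithm search over cluster assignments is not shown.

```python
from itertools import combinations


def rand_index_errors(true_speakers, clusters):
    """Count utterance pairs on which the clustering disagrees with the truth.

    A pair is an error when two utterances of the same speaker fall in
    different clusters, or two utterances of different speakers fall in
    the same cluster. Lower is better; 0 means a perfect clustering.
    """
    errors = 0
    for i, j in combinations(range(len(true_speakers)), 2):
        same_speaker = true_speakers[i] == true_speakers[j]
        same_cluster = clusters[i] == clusters[j]
        if same_speaker != same_cluster:
            errors += 1
    return errors


def estimated_errors(similarity, clusters):
    """Hypothetical surrogate for the error count when speakers are unknown.

    ASSUMPTION (not from the source): similarity[i][j] is in [0, 1] and
    approximates the probability that utterances i and j share a speaker.
    """
    total = 0.0
    for i, j in combinations(range(len(clusters)), 2):
        s = similarity[i][j]
        # Same cluster: expected error is the chance the speakers differ.
        # Different clusters: expected error is the chance they match.
        total += (1.0 - s) if clusters[i] == clusters[j] else s
    return total


truth = ["a", "a", "b", "b", "c"]
print(rand_index_errors(truth, [0, 0, 1, 1, 2]))  # correct clustering
print(rand_index_errors(truth, [0, 0, 0, 0, 1]))  # too few clusters
print(rand_index_errors(truth, [0, 1, 2, 3, 4]))  # too many clusters
```

In this toy example both under-clustering and over-clustering raise the error count, which illustrates why minimizing such a quantity can select the number of clusters as well as the assignment.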
Related articles
Cluster Identification for Speaker
Cluster Identification is introduced as the process of jointly evaluating clustering and labelling schemes for cluster-labelling scheme selection. Normalized Rand and BBN metrics for comparing clustering performances across varied clustering and labelling schemes are presented. The merits of the metrics are evaluated and applied for speaker-environment tracking in Broadcast News.
Bilateral Weighted Fuzzy C-Means Clustering
Nowadays, the Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...
Improved Automatic Clustering Using a Multi-Objective Evolutionary Algorithm With New Validity measure and application to Credit Scoring
In data mining, clustering is an important technique for separating and grouping unlabeled data in an unsupervised way. In this paper, an attempt has been made to improve and optimize the application of heuristic clustering methods, such as the genetic algorithm, the PSO algorithm, the artificial bee colony algorithm, the harmony search algorithm, and differential evolution, on the unlabeled data of an Iranian bank...
Integration of evolutionary computation algorithms and new AUTO-TLBO technique in the speaker clustering stage for speaker diarization of broadcast news
The task of speaker diarization is to answer the question "who spoke when?" In this paper, we present different clustering approaches which consist of Evolutionary Computation Algorithms (ECAs) such as Genetic Algorithm (GA), Particle Swarm Optimization (PSO) algorithm, and Differential Evolution (DE) algorithm as well as Teaching-Learning-Based Optimization (TLBO) technique as a new optimizati...
Clustering of gene expression data using random forest dissimilarity
Background: The clustering of gene expression data plays an important role in the diagnosis and treatment of cancer. These kinds of data typically involve a large number of variables (genes) compared with the number of samples (patients). Many clustering methods have been built on the dissimilarity among observations, which is calculated by a distance function. As increa...
Journal title:
- Computer Speech & Language
Volume 23, Issue
Pages -
Publication date: 2009